Install deliberr package

# remotes::install_github("gumbelino/deliberr")
library(deliberr)
## Warning: replacing previous import 'ggplot2::alpha' by 'psych::alpha' when
## loading 'deliberr'
lsf.str("package:deliberr")
## get_dri : function (ic, adjusted = TRUE)  
## get_dri_alpha : function (data)  
## get_dri_ic : function (data)  
## get_dri_ind : function (ic)  
## permute_dri : function (data, iterations = 10000, verbose = FALSE, summary = TRUE)  
## plot_dri_ic : function (ic, title = NA, suffix = NA, dri = NA)  
## summarize_perm_dri : function (perms, type = "common")

Overview of data for analysis of LLM roles

Large Language Models (LLMs) Preview

LLMs
Provider Model Class
1 anthropic claude-3-5-sonnet-20241022 top
2 anthropic claude-3-7-sonnet-20250219 top
3 xai grok-3-beta top
4 google gemini-2.5-flash new
5 anthropic claude-3-haiku-20240307 bottom
6 cohere command-r-08-2024 bottom
7 openai gpt-3.5-turbo bottom
8 openai gpt-4o-mini bottom

Building on our previous analysis, we selected only top models. gemini-2.5-flash replaces gemini-2.5-pro-preview-03-25, which is no longer supported. The pro version was also too slow to include in this analysis, so we used the flash version instead.

Cases

Deliberative Cases
case survey N topic subtopic
1 UBC Bio biobanking_mayo_ubc 17 genomics genomics
2 CCPS ACT Deliberative ccps 31 climate climate
3 CSIRO WA energy_futures 17 climate energy
4 FNQCJ fnqcj 11 climate transportation
5 Forest Lay Citizen forestera 9 climate forest
6 Fremantle fremantle 41 transportation transportation
7 Activate uppsala_speaks 26 immigration immigration
8 Standard uppsala_speaks 22 immigration immigration
9 Winterthur zh_winterthur 16 climate climate

Also building on our previous analysis, we selected only deliberative cases.

Surveys

Surveys
survey considerations policies scale_max q_method
1 biobanking_mayo_ubc 38 7 11 FALSE
2 ccps 33 7 11 FALSE
3 energy_futures 45 9 11 FALSE
4 fnqcj 42 5 12 FALSE
5 forestera 45 7 11 FALSE
6 fremantle 36 6 11 TRUE
7 uppsala_speaks 42 7 7 FALSE
8 zh_winterthur 30 6 7 FALSE

Note that two of the cases share the same survey.

Roles (System Prompts)

Number of Prompts by Type
type n
devils 1
ideology 10
perspective 10
System Prompts
uid type role description
1 csk devils climate skeptic prioritizes economic growth over CO2 emission cuts, fossil fuels over renewable energy, and does not believe in climate science
2 ana ideology anarchist rejects all coercive authority and hierarchical government, advocating stateless, voluntary societies
3 con ideology conservative seeks to preserve traditional institutions, customs, and values, favoring order and gradual change
4 eco ideology ecologist focuses on environmental protection and sustainability, advocating for societal change to ecological limits
5 fas ideology fascist promotes extreme nationalism, authoritarianism, militarism, and a totalitarian state
6 fem ideology feminist advocates for gender equality, challenging patriarchal structures and discrimination against women
7 fun ideology fundamentalist adheres strictly to core beliefs, often religious, applying these principles to all life aspects
8 lib ideology liberal advocates individual liberty, rights, limited government, and free markets, emphasizing individual autonomy
9 nat ideology nationalist prioritizes the interests and identity of a particular nation, often seeking self-determination
10 pop ideology populist appeals directly to “the people” against a perceived corrupt elite using anti-establishment rhetoric
11 soc ideology socialist aims for social ownership or control of production, emphasizing equality and collective welfare
12 coa perspective coastal resident endures chronic flooding and salinization, forced to relocate due to rising sea levels and intense storms worsened by climate change
13 ctr perspective construction worker suffers from extreme heat stress and lost work hours, perceiving climate change making outdoor labor unbearable and life-threatening
14 dis perspective disease survivor recovers from dengue fever, aware that climate change’s rising temperatures are expanding the range of disease-carrying mosquitoes in their region
15 eld perspective elderly urban resident endures intensified city heatwaves, struggling with disrupted services and feeling the direct, severe impact of climate change
16 far perspective displaced family loses their home due to unprecedented wildfires, experiencing displacement and recognizing climate change as the major driver of the devastation
17 fis perspective fisher notes his declining catches due to warming oceans, understanding that climate change is reorganizing marine life and reducing their traditional yield
18 lan perspective landowner surveys his parched fields after a prolonged drought, feeling the compounding impacts of climate change that reduce crop yields and family income
19 par perspective parent sees their child fall ill from a water-borne disease, attributing its spread to the increased heavy rainfall and warmer temperatures brought by climate change
20 sub perspective subsistence farmer watches his crops wither under erratic rainfall patterns, and who sees these changes as direct consequence of climate change
21 vil perspective villager faces dwindling, contaminated water supplies due to extended draughts and floods, aware that climate change is altering their water security

Summary of LLM Data Collection

We collected a total of 6720 LLM responses from 8 models across 8 surveys and 21 roles. We prompted each LLM 5 times with the same prompt.
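The total above follows directly from the design. A minimal check, using only the counts stated in this section (the variable names are illustrative):

```r
# Counts reported in this section
n_models  <- 8
n_surveys <- 8
n_roles   <- 21
n_reps    <- 5   # each prompt was sent 5 times

# One response per model x survey x role x repetition
n_total <- n_models * n_surveys * n_roles * n_reps
n_total
```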

Climate Analysis

Subset of cases used in the climate analysis
case survey N topic subtopic
1 CCPS ACT Deliberative ccps 31 climate climate
2 CSIRO WA energy_futures 17 climate energy
3 Winterthur zh_winterthur 16 climate climate
Subset of roles used in the climate analysis
uid type article role description
1 eco ideology an ecologist focuses on environmental protection and sustainability, advocating for societal change to ecological limits
2 coa perspective a coastal resident endures chronic flooding and salinization, forced to relocate due to rising sea levels and intense storms worsened by climate change
3 ctr perspective a construction worker suffers from extreme heat stress and lost work hours, perceiving climate change making outdoor labor unbearable and life-threatening
4 dis perspective a disease survivor recovers from dengue fever, aware that climate change’s rising temperatures are expanding the range of disease-carrying mosquitoes in their region
5 eld perspective an elderly urban resident endures intensified city heatwaves, struggling with disrupted services and feeling the direct, severe impact of climate change
6 far perspective a displaced family loses their home due to unprecedented wildfires, experiencing displacement and recognizing climate change as the major driver of the devastation
7 fis perspective a fisher notes his declining catches due to warming oceans, understanding that climate change is reorganizing marine life and reducing their traditional yield
8 lan perspective a landowner surveys his parched fields after a prolonged drought, feeling the compounding impacts of climate change that reduce crop yields and family income
9 par perspective a parent sees their child fall ill from a water-borne disease, attributing its spread to the increased heavy rainfall and warmer temperatures brought by climate change
10 sub perspective a subsistence farmer watches his crops wither under erratic rainfall patterns, and who sees these changes as direct consequence of climate change
11 vil perspective a villager faces dwindling, contaminated water supplies due to extended draughts and floods, aware that climate change is altering their water security
12 csk devils a climate skeptic prioritizes economic growth over CO2 emission cuts, fossil fuels over renewable energy, and does not believe in climate science
## # A tibble: 8 × 2
##   model                          n
##   <chr>                      <int>
## 1 claude-3-5-sonnet-20241022   180
## 2 claude-3-7-sonnet-20250219   180
## 3 claude-3-haiku-20240307      180
## 4 command-r-08-2024            180
## 5 gemini-2.5-flash             180
## 6 gpt-3.5-turbo                180
## 7 gpt-4o-mini                  180
## 8 grok-3-beta                  180

For the climate analysis, we selected a subset of 1440 responses generated by 8 models across 3 surveys and 12 roles described above. We prompted each LLM 5 times with the same prompt.

We calculated one DRI value per model/survey/role by treating each LLM response as one participant in a deliberation. The role “all” indicates that all roles were part of that deliberation (n = 60 participants, which equals 5 participants for each of the 12 roles). See example below.
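The grouping described above can be sketched in base R. This only reproduces the bookkeeping (which responses form each "deliberation" and how many participants each one has); it does not compute DRI, and the data frame and column names are assumptions made for illustration.

```r
models  <- c("claude-3-5-sonnet-20241022", "claude-3-7-sonnet-20250219",
             "claude-3-haiku-20240307", "command-r-08-2024",
             "gemini-2.5-flash", "gpt-3.5-turbo", "gpt-4o-mini", "grok-3-beta")
surveys <- c("ccps", "energy_futures", "zh_winterthur")
roles   <- c("eco", "coa", "ctr", "dis", "eld", "far",
             "fis", "lan", "par", "sub", "vil", "csk")

# One row per LLM response: 5 repetitions of every model/survey/role prompt
responses <- expand.grid(model = models, survey = surveys, role = roles,
                         rep = 1:5, stringsAsFactors = FALSE)

# Per-role deliberations: the 5 repetitions act as 5 participants
per_role <- aggregate(rep ~ model + survey + role, data = responses, FUN = length)
names(per_role)[4] <- "n"

# Pooled deliberation ("all"): every role's responses together
pooled <- aggregate(rep ~ model + survey, data = responses, FUN = length)
names(pooled)[3] <- "n"
pooled$role <- "all"

# One row per model/survey/role group, including the pooled "all" group
groups <- rbind(per_role, pooled[, c("model", "survey", "role", "n")])
```

Under these assumptions there are 1440 responses and 312 groups (13 per model/survey pair), which matches the number of rows in the detailed table.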

Consistency results

Top models

Head (5) of DRI consistency across climate roles
model survey role dri alpha_c alpha_p alpha_all n
claude-3-5-sonnet-20241022 ccps all 0.417 0.991 0.623 0.988 60
claude-3-5-sonnet-20241022 ccps coa 0.437 0.792 0.590 0.836 5
claude-3-5-sonnet-20241022 ccps csk 0.158 0.768 0.778 0.746 5
claude-3-5-sonnet-20241022 ccps ctr 0.380 0.942 0.740 0.934 5
claude-3-5-sonnet-20241022 ccps dis 0.468 0.866 0.733 0.868 5

Bottom models

Note that each role has 12 data points: 3 surveys x 4 models.

We found that LLMs are consistent across roles both in terms of DRI and Cronbach’s alpha (policies). The high DRI across roles (median = -0.177; IQR = 0.163) suggests that LLMs tend to consistently align their considerations and policy preferences. The high Cronbach’s alpha for their policy preferences (median = 0.635; IQR = 0.11) suggests that LLMs tend to agree on the ranking of their policy preferences.
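For readers unfamiliar with the statistic, the following is a minimal base-R sketch of the textbook Cronbach's alpha formula. It is not the implementation behind `get_dri_alpha()`; the layout (rows are policies, columns are individual LLM responses) and the ratings themselves are assumptions chosen for illustration.

```r
# Textbook Cronbach's alpha: k / (k - 1) * (1 - sum(item variances) / variance of totals)
cronbach_alpha <- function(ratings) {
  ratings   <- as.matrix(ratings)
  k         <- ncol(ratings)             # number of "items" (here: responses)
  item_var  <- apply(ratings, 2, var)    # variance of each column
  total_var <- var(rowSums(ratings))     # variance of the row totals
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Hypothetical rankings of 7 policies by 5 responses from the same role
ratings <- cbind(
  r1 = c(1, 2, 3, 4, 5, 6, 7),
  r2 = c(2, 1, 3, 4, 5, 7, 6),
  r3 = c(1, 2, 4, 3, 5, 6, 7),
  r4 = c(1, 3, 2, 4, 6, 5, 7),
  r5 = c(2, 1, 3, 5, 4, 6, 7)
)

alpha_example   <- cronbach_alpha(ratings)
alpha_identical <- cronbach_alpha(cbind(1:7, 1:7, 1:7))  # full agreement
```

When every response ranks the policies identically, alpha equals 1; the more the rankings diverge, the lower the value.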

Summary for model

Mean DRI across models and roles
role claude-3-5-sonnet-20241022 claude-3-7-sonnet-20250219 claude-3-haiku-20240307 command-r-08-2024 gemini-2.5-flash gpt-3.5-turbo gpt-4o-mini grok-3-beta best
1 all 0.512 0.639 -0.291 -0.281 0.638 -0.213 0.000 0.625 claude-3-7-sonnet-20250219
2 coa 0.350 0.565 -0.526 -0.435 0.810 -0.315 -0.019 0.567 gemini-2.5-flash
3 csk 0.543 0.773 -0.118 -0.580 0.875 0.163 -0.153 0.795 gemini-2.5-flash
4 ctr 0.343 0.567 -0.368 -0.264 0.663 -0.129 0.252 0.447 gemini-2.5-flash
5 dis 0.476 0.538 -0.553 -0.490 0.569 -0.719 0.057 0.455 gemini-2.5-flash
6 eco 0.364 0.720 -0.281 -0.831 0.854 -0.472 0.084 0.696 gemini-2.5-flash
7 eld 0.404 0.498 -0.335 -0.396 0.796 -0.078 -0.322 0.626 gemini-2.5-flash
8 far 0.479 0.651 -0.524 -0.673 0.821 -0.388 -0.370 0.497 gemini-2.5-flash
9 fis 0.497 0.593 -0.492 -0.560 0.685 -0.665 -0.244 0.602 gemini-2.5-flash
10 lan 0.595 0.633 -0.318 -0.347 0.477 -0.466 0.199 0.587 claude-3-7-sonnet-20250219
11 par 0.498 0.708 -0.669 -0.472 0.598 -0.164 -0.284 0.670 claude-3-7-sonnet-20250219
12 sub 0.526 0.712 -0.433 -0.218 0.556 -0.106 -0.014 0.654 claude-3-7-sonnet-20250219
13 vil 0.581 0.604 -0.612 -0.550 0.407 -0.490 -0.252 0.613 grok-3-beta
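As an illustration of how a row of this table can be derived, the sketch below averages the per-survey DRI values for the role "all" (copied from the detailed data table in this report) and picks the best model. Because the inputs are already rounded to three decimals, the means may differ from the table in the last digit. The object names are illustrative.

```r
model_names <- c("claude-3-5-sonnet-20241022", "claude-3-7-sonnet-20250219",
                 "claude-3-haiku-20240307", "command-r-08-2024",
                 "gemini-2.5-flash", "gpt-3.5-turbo", "gpt-4o-mini", "grok-3-beta")

# DRI for role "all", in survey order ccps, energy_futures, zh_winterthur
dri_all <- data.frame(
  model  = rep(model_names, each = 3),
  survey = rep(c("ccps", "energy_futures", "zh_winterthur"), times = 8),
  dri    = c( 0.417,  0.497,  0.624,
              0.676,  0.591,  0.649,
             -0.171, -0.182, -0.520,
             -0.238,  0.075, -0.680,
              0.711,  0.527,  0.677,
             -0.218, -0.137, -0.284,
              0.123, -0.010, -0.112,
              0.427,  0.691,  0.759)
)

# Mean DRI per model across the three surveys
mean_dri <- tapply(dri_all$dri, dri_all$model, mean)

# Model with the highest mean DRI (the "best" column)
best <- names(mean_dri)[which.max(mean_dri)]
round(mean_dri, 3)
best
```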

Summary Cronbach’s Alpha (Policies)

Mean alpha (policies) across models and roles
role claude-3-5-sonnet-20241022 claude-3-7-sonnet-20250219 claude-3-haiku-20240307 command-r-08-2024 gemini-2.5-flash gpt-3.5-turbo gpt-4o-mini grok-3-beta best
1 all 0.725 0.792 0.614 0.638 0.801 0.599 0.641 0.818 grok-3-beta
2 coa 0.713 0.745 0.816 0.808 0.771 0.737 0.763 0.807 claude-3-haiku-20240307
3 csk 0.783 0.802 0.813 0.708 0.848 0.764 0.715 0.851 grok-3-beta
4 ctr 0.749 0.791 0.774 0.776 0.918 0.787 0.727 0.755 gemini-2.5-flash
5 dis 0.761 0.772 0.669 0.802 0.771 0.762 0.756 0.796 command-r-08-2024
6 eco 0.764 0.844 0.711 0.730 0.814 0.800 0.759 0.716 claude-3-7-sonnet-20250219
7 eld 0.722 0.793 0.788 0.740 0.741 0.801 0.813 0.828 grok-3-beta
8 far 0.726 0.807 0.791 0.843 0.827 0.769 0.828 0.824 command-r-08-2024
9 fis 0.787 0.792 0.690 0.793 0.829 0.750 0.825 0.704 gemini-2.5-flash
10 lan 0.715 0.792 0.802 0.805 0.789 0.783 0.795 0.792 command-r-08-2024
11 par 0.785 0.704 0.774 0.777 0.790 0.778 0.762 0.833 grok-3-beta
12 sub 0.841 0.800 0.671 0.754 0.761 0.760 0.803 0.839 claude-3-5-sonnet-20241022
13 vil 0.708 0.818 0.770 0.794 0.808 0.786 0.798 0.662 claude-3-7-sonnet-20250219

Summary Cronbach’s Alpha (Considerations)

Mean alpha (considerations) across models and roles
role claude-3-5-sonnet-20241022 claude-3-7-sonnet-20250219 claude-3-haiku-20240307 command-r-08-2024 gemini-2.5-flash gpt-3.5-turbo gpt-4o-mini grok-3-beta best
1 all 0.990 0.990 0.976 0.975 0.984 0.911 0.976 0.987 claude-3-5-sonnet-20241022
2 coa 0.863 0.918 0.880 0.787 0.849 0.886 0.837 0.891 claude-3-7-sonnet-20250219
3 csk 0.769 0.856 0.898 0.767 0.551 0.952 0.817 0.831 gpt-3.5-turbo
4 ctr 0.916 0.909 0.872 0.915 0.852 0.916 0.852 0.906 claude-3-5-sonnet-20241022
5 dis 0.905 0.921 0.894 0.904 0.859 0.918 0.876 0.896 claude-3-7-sonnet-20250219
6 eco 0.900 0.860 0.884 0.827 0.842 0.865 0.871 0.863 claude-3-5-sonnet-20241022
7 eld 0.917 0.899 0.919 0.886 0.917 0.911 0.879 0.903 claude-3-haiku-20240307
8 far 0.905 0.848 0.919 0.747 0.815 0.774 0.860 0.905 claude-3-haiku-20240307
9 fis 0.916 0.895 0.894 0.907 0.896 0.918 0.891 0.905 gpt-3.5-turbo
10 lan 0.917 0.914 0.884 0.904 0.884 0.885 0.909 0.917 claude-3-5-sonnet-20241022
11 par 0.925 0.905 0.863 0.867 0.830 0.888 0.885 0.922 claude-3-5-sonnet-20241022
12 sub 0.902 0.919 0.895 0.758 0.851 0.889 0.906 0.911 claude-3-7-sonnet-20250219
13 vil 0.881 0.880 0.914 0.901 0.873 0.927 0.895 0.887 gpt-3.5-turbo

Detailed data

DRI consistency across 12 climate roles
model survey role dri alpha_c alpha_p alpha_all n
1 claude-3-5-sonnet-20241022 ccps all 0.417 0.991 0.623 0.988 60
2 claude-3-5-sonnet-20241022 ccps coa 0.437 0.792 0.590 0.836 5
3 claude-3-5-sonnet-20241022 ccps csk 0.158 0.768 0.778 0.746 5
4 claude-3-5-sonnet-20241022 ccps ctr 0.380 0.942 0.740 0.934 5
5 claude-3-5-sonnet-20241022 ccps dis 0.468 0.866 0.733 0.868 5
6 claude-3-5-sonnet-20241022 ccps eco 0.340 0.863 0.757 0.898 5
7 claude-3-5-sonnet-20241022 ccps eld 0.322 0.909 0.673 0.901 5
8 claude-3-5-sonnet-20241022 ccps far 0.434 0.901 0.632 0.916 5
9 claude-3-5-sonnet-20241022 ccps fis 0.424 0.941 0.776 0.928 5
10 claude-3-5-sonnet-20241022 ccps lan 0.457 0.933 0.689 0.923 5
11 claude-3-5-sonnet-20241022 ccps par 0.520 0.915 0.728 0.896 5
12 claude-3-5-sonnet-20241022 ccps sub -0.029 0.870 0.798 0.883 5
13 claude-3-5-sonnet-20241022 ccps vil 0.600 0.866 0.791 0.802 5
14 claude-3-5-sonnet-20241022 energy_futures all 0.497 0.989 0.772 0.988 60
15 claude-3-5-sonnet-20241022 energy_futures coa 0.167 0.881 0.771 0.903 5
16 claude-3-5-sonnet-20241022 energy_futures csk 0.869 0.915 0.726 0.919 5
17 claude-3-5-sonnet-20241022 energy_futures ctr -0.023 0.896 0.685 0.885 5
18 claude-3-5-sonnet-20241022 energy_futures dis 0.477 0.922 0.763 0.917 5
19 claude-3-5-sonnet-20241022 energy_futures eco 0.326 0.950 0.679 0.933 5
20 claude-3-5-sonnet-20241022 energy_futures eld 0.246 0.909 0.693 0.929 5
21 claude-3-5-sonnet-20241022 energy_futures far 0.553 0.942 0.767 0.947 5
22 claude-3-5-sonnet-20241022 energy_futures fis 0.436 0.915 0.786 0.935 5
23 claude-3-5-sonnet-20241022 energy_futures lan 0.645 0.951 0.658 0.952 5
24 claude-3-5-sonnet-20241022 energy_futures par 0.535 0.919 0.776 0.939 5
25 claude-3-5-sonnet-20241022 energy_futures sub 0.846 0.922 0.882 0.921 5
26 claude-3-5-sonnet-20241022 energy_futures vil 0.558 0.928 0.517 0.931 5
27 claude-3-5-sonnet-20241022 zh_winterthur all 0.624 0.989 0.780 0.988 60
28 claude-3-5-sonnet-20241022 zh_winterthur coa 0.447 0.916 0.778 0.845 5
29 claude-3-5-sonnet-20241022 zh_winterthur csk 0.601 0.623 0.845 0.820 5
30 claude-3-5-sonnet-20241022 zh_winterthur ctr 0.672 0.912 0.822 0.901 5
31 claude-3-5-sonnet-20241022 zh_winterthur dis 0.484 0.927 0.786 0.914 5
32 claude-3-5-sonnet-20241022 zh_winterthur eco 0.425 0.887 0.855 0.859 5
33 claude-3-5-sonnet-20241022 zh_winterthur eld 0.645 0.933 0.799 0.892 5
34 claude-3-5-sonnet-20241022 zh_winterthur far 0.449 0.870 0.778 0.839 5
35 claude-3-5-sonnet-20241022 zh_winterthur fis 0.631 0.893 0.799 0.900 5
36 claude-3-5-sonnet-20241022 zh_winterthur lan 0.683 0.868 0.799 0.867 5
37 claude-3-5-sonnet-20241022 zh_winterthur par 0.440 0.941 0.850 0.910 5
38 claude-3-5-sonnet-20241022 zh_winterthur sub 0.761 0.913 0.844 0.901 5
39 claude-3-5-sonnet-20241022 zh_winterthur vil 0.584 0.847 0.816 0.865 5
40 claude-3-7-sonnet-20250219 ccps all 0.676 0.990 0.775 0.989 60
41 claude-3-7-sonnet-20250219 ccps coa 0.683 0.874 0.717 0.908 5
42 claude-3-7-sonnet-20250219 ccps csk 0.719 0.813 0.855 0.863 5
43 claude-3-7-sonnet-20250219 ccps ctr 0.769 0.951 0.682 0.944 5
44 claude-3-7-sonnet-20250219 ccps dis 0.544 0.927 0.732 0.916 5
45 claude-3-7-sonnet-20250219 ccps eco 0.867 0.862 0.887 0.873 5
46 claude-3-7-sonnet-20250219 ccps eld 0.576 0.890 0.732 0.899 5
47 claude-3-7-sonnet-20250219 ccps far 0.785 0.759 0.682 0.853 5
48 claude-3-7-sonnet-20250219 ccps fis 0.582 0.899 0.819 0.888 5
49 claude-3-7-sonnet-20250219 ccps lan 0.523 0.917 0.764 0.902 5
50 claude-3-7-sonnet-20250219 ccps par 0.770 0.902 0.682 0.921 5
51 claude-3-7-sonnet-20250219 ccps sub 0.780 0.920 0.682 0.923 5
52 claude-3-7-sonnet-20250219 ccps vil 0.585 0.817 0.798 0.872 5
53 claude-3-7-sonnet-20250219 energy_futures all 0.591 0.988 0.814 0.988 60
54 claude-3-7-sonnet-20250219 energy_futures coa 0.560 0.935 0.741 0.941 5
55 claude-3-7-sonnet-20250219 energy_futures csk 0.801 0.915 0.833 0.947 5
56 claude-3-7-sonnet-20250219 energy_futures ctr 0.420 0.902 0.842 0.929 5
57 claude-3-7-sonnet-20250219 energy_futures dis 0.568 0.911 0.689 0.901 5
58 claude-3-7-sonnet-20250219 energy_futures eco 0.774 0.859 0.706 0.901 5
59 claude-3-7-sonnet-20250219 energy_futures eld 0.288 0.930 0.789 0.942 5
60 claude-3-7-sonnet-20250219 energy_futures far 0.663 0.917 0.889 0.928 5
61 claude-3-7-sonnet-20250219 energy_futures fis 0.563 0.935 0.758 0.942 5
62 claude-3-7-sonnet-20250219 energy_futures lan 0.546 0.863 0.797 0.893 5
63 claude-3-7-sonnet-20250219 energy_futures par 0.813 0.924 0.598 0.921 5
64 claude-3-7-sonnet-20250219 energy_futures sub 0.791 0.936 0.849 0.949 5
65 claude-3-7-sonnet-20250219 energy_futures vil 0.622 0.910 0.798 0.924 5
66 claude-3-7-sonnet-20250219 zh_winterthur all 0.649 0.991 0.787 0.989 60
67 claude-3-7-sonnet-20250219 zh_winterthur coa 0.452 0.945 0.778 0.916 5
68 claude-3-7-sonnet-20250219 zh_winterthur csk 0.797 0.839 0.718 0.841 5
69 claude-3-7-sonnet-20250219 zh_winterthur ctr 0.512 0.874 0.848 0.894 5
70 claude-3-7-sonnet-20250219 zh_winterthur dis 0.504 0.924 0.894 0.880 5
71 claude-3-7-sonnet-20250219 zh_winterthur eco 0.517 0.860 0.939 0.877 5
72 claude-3-7-sonnet-20250219 zh_winterthur eld 0.630 0.875 0.857 0.854 5
73 claude-3-7-sonnet-20250219 zh_winterthur far 0.506 0.866 0.848 0.884 5
74 claude-3-7-sonnet-20250219 zh_winterthur fis 0.633 0.851 0.800 0.910 5
75 claude-3-7-sonnet-20250219 zh_winterthur lan 0.830 0.961 0.816 0.964 5
76 claude-3-7-sonnet-20250219 zh_winterthur par 0.543 0.888 0.833 0.812 5
77 claude-3-7-sonnet-20250219 zh_winterthur sub 0.564 0.902 0.870 0.929 5
78 claude-3-7-sonnet-20250219 zh_winterthur vil 0.606 0.912 0.857 0.914 5
79 claude-3-haiku-20240307 ccps all -0.171 0.977 0.707 0.972 60
80 claude-3-haiku-20240307 ccps coa -0.627 0.846 0.838 0.887 5
81 claude-3-haiku-20240307 ccps csk -0.115 0.863 0.825 0.860 5
82 claude-3-haiku-20240307 ccps ctr -0.044 0.801 0.771 0.891 5
83 claude-3-haiku-20240307 ccps dis -0.338 0.836 0.784 0.851 5
84 claude-3-haiku-20240307 ccps eco -0.050 0.835 0.781 0.880 5
85 claude-3-haiku-20240307 ccps eld -0.358 0.916 0.807 0.918 5
86 claude-3-haiku-20240307 ccps far -0.122 0.946 0.861 0.947 5
87 claude-3-haiku-20240307 ccps fis -0.509 0.887 0.674 0.882 5
88 claude-3-haiku-20240307 ccps lan -0.002 0.878 0.832 0.900 5
89 claude-3-haiku-20240307 ccps par -0.608 0.805 0.861 0.873 5
90 claude-3-haiku-20240307 ccps sub -0.517 0.884 0.624 0.893 5
91 claude-3-haiku-20240307 ccps vil -0.437 0.898 0.826 0.855 5
92 claude-3-haiku-20240307 energy_futures all -0.182 0.966 0.692 0.953 60
93 claude-3-haiku-20240307 energy_futures coa -0.553 0.895 0.829 0.891 5
94 claude-3-haiku-20240307 energy_futures csk 0.126 0.947 0.727 0.944 5
95 claude-3-haiku-20240307 energy_futures ctr -0.341 0.957 0.765 0.951 5
96 claude-3-haiku-20240307 energy_futures dis -0.343 0.918 0.546 0.902 5
97 claude-3-haiku-20240307 energy_futures eco -0.229 0.949 0.624 0.946 5
98 claude-3-haiku-20240307 energy_futures eld -0.414 0.919 0.764 0.910 5
99 claude-3-haiku-20240307 energy_futures far -0.591 0.920 0.766 0.923 5
100 claude-3-haiku-20240307 energy_futures fis -0.296 0.918 0.746 0.894 5
101 claude-3-haiku-20240307 energy_futures lan -0.608 0.921 0.843 0.899 5
102 claude-3-haiku-20240307 energy_futures par -0.392 0.913 0.711 0.896 5
103 claude-3-haiku-20240307 energy_futures sub -0.219 0.916 0.701 0.928 5
104 claude-3-haiku-20240307 energy_futures vil -0.442 0.954 0.750 0.930 5
105 claude-3-haiku-20240307 zh_winterthur all -0.520 0.984 0.444 0.969 60
106 claude-3-haiku-20240307 zh_winterthur coa -0.399 0.899 0.779 0.836 5
107 claude-3-haiku-20240307 zh_winterthur csk -0.364 0.884 0.886 0.919 5
108 claude-3-haiku-20240307 zh_winterthur ctr -0.717 0.857 0.785 0.896 5
109 claude-3-haiku-20240307 zh_winterthur dis -0.979 0.927 0.678 0.898 5
110 claude-3-haiku-20240307 zh_winterthur eco -0.564 0.868 0.730 0.857 5
111 claude-3-haiku-20240307 zh_winterthur eld -0.233 0.922 0.794 0.902 5
112 claude-3-haiku-20240307 zh_winterthur far -0.860 0.893 0.745 0.823 5
113 claude-3-haiku-20240307 zh_winterthur fis -0.670 0.877 0.652 0.859 5
114 claude-3-haiku-20240307 zh_winterthur lan -0.343 0.855 0.731 0.879 5
115 claude-3-haiku-20240307 zh_winterthur par -1.007 0.872 0.749 0.853 5
116 claude-3-haiku-20240307 zh_winterthur sub -0.564 0.886 0.687 0.891 5
117 claude-3-haiku-20240307 zh_winterthur vil -0.956 0.890 0.732 0.786 5
118 command-r-08-2024 ccps all -0.238 0.980 0.804 0.974 60
119 command-r-08-2024 ccps coa -0.352 0.627 0.909 0.608 5
120 command-r-08-2024 ccps csk -0.612 0.664 0.789 0.705 5
121 command-r-08-2024 ccps ctr -0.260 0.933 0.816 0.934 5
122 command-r-08-2024 ccps dis -0.405 0.871 0.819 0.887 5
123 command-r-08-2024 ccps eco -0.982 0.695 0.733 0.795 5
124 command-r-08-2024 ccps eld -0.633 0.855 0.700 0.869 5
125 command-r-08-2024 ccps far -0.490 0.566 0.894 0.757 5
126 command-r-08-2024 ccps fis -0.510 0.881 0.862 0.879 5
127 command-r-08-2024 ccps lan -0.350 0.794 0.876 0.819 5
128 command-r-08-2024 ccps par -0.174 0.880 0.894 0.920 5
129 command-r-08-2024 ccps sub -0.213 0.658 0.863 0.650 5
130 command-r-08-2024 ccps vil -0.671 0.900 0.761 0.926 5
131 command-r-08-2024 energy_futures all 0.075 0.956 0.429 0.952 60
132 command-r-08-2024 energy_futures coa 0.068 0.841 0.741 0.864 5
133 command-r-08-2024 energy_futures csk -0.194 0.800 0.687 0.843 5
134 command-r-08-2024 energy_futures ctr 0.080 0.946 0.699 0.952 5
135 command-r-08-2024 energy_futures dis -0.248 0.952 0.811 0.957 5
136 command-r-08-2024 energy_futures eco -0.366 0.921 0.722 0.938 5
137 command-r-08-2024 energy_futures eld 0.417 0.900 0.766 0.916 5
138 command-r-08-2024 energy_futures far -0.510 0.854 0.840 0.899 5
139 command-r-08-2024 energy_futures fis -0.020 0.911 0.725 0.924 5
140 command-r-08-2024 energy_futures lan 0.096 0.967 0.773 0.965 5
141 command-r-08-2024 energy_futures par -0.126 0.834 0.655 0.861 5
142 command-r-08-2024 energy_futures sub 0.465 0.896 0.619 0.908 5
143 command-r-08-2024 energy_futures vil 0.043 0.930 0.806 0.923 5
144 command-r-08-2024 zh_winterthur all -0.680 0.990 0.680 0.977 60
145 command-r-08-2024 zh_winterthur coa -1.020 0.892 0.773 0.890 5
146 command-r-08-2024 zh_winterthur csk -0.932 0.837 0.649 0.797 5
147 command-r-08-2024 zh_winterthur ctr -0.610 0.866 0.813 0.903 5
148 command-r-08-2024 zh_winterthur dis -0.818 0.891 0.777 0.882 5
149 command-r-08-2024 zh_winterthur eco -1.146 0.866 0.736 0.903 5
150 command-r-08-2024 zh_winterthur eld -0.972 0.902 0.755 0.900 5
151 command-r-08-2024 zh_winterthur far -1.019 0.820 0.796 0.875 5
152 command-r-08-2024 zh_winterthur fis -1.149 0.930 0.793 0.885 5
153 command-r-08-2024 zh_winterthur lan -0.788 0.951 0.765 0.931 5
154 command-r-08-2024 zh_winterthur par -1.117 0.889 0.781 0.884 5
155 command-r-08-2024 zh_winterthur sub -0.906 0.722 0.780 0.831 5
156 command-r-08-2024 zh_winterthur vil -1.021 0.872 0.814 0.905 5
157 gemini-2.5-flash ccps all 0.711 0.982 0.765 0.982 60
158 gemini-2.5-flash ccps coa 0.854 0.750 0.889 0.805 5
159 gemini-2.5-flash ccps csk 0.895 0.606 0.722 0.718 5
160 gemini-2.5-flash ccps ctr 0.784 0.782 0.948 0.839 5
161 gemini-2.5-flash ccps dis 0.942 0.880 0.831 0.889 5
162 gemini-2.5-flash ccps eco 0.826 0.841 0.940 0.872 5
163 gemini-2.5-flash ccps eld 0.848 0.848 0.647 0.876 5
164 gemini-2.5-flash ccps far 0.937 0.702 0.750 0.694 5
165 gemini-2.5-flash ccps fis 0.899 0.874 0.750 0.882 5
166 gemini-2.5-flash ccps lan 0.688 0.833 0.781 0.844 5
167 gemini-2.5-flash ccps par 0.703 0.869 0.706 0.883 5
168 gemini-2.5-flash ccps sub 0.875 0.861 0.844 0.891 5
169 gemini-2.5-flash ccps vil 0.753 0.882 0.725 0.912 5
170 gemini-2.5-flash energy_futures all 0.527 0.981 0.825 0.982 60
171 gemini-2.5-flash energy_futures coa 0.906 0.928 0.625 0.941 5
172 gemini-2.5-flash energy_futures csk 0.809 0.859 0.907 0.902 5
173 gemini-2.5-flash energy_futures ctr 0.620 0.893 0.857 0.915 5
174 gemini-2.5-flash energy_futures dis 0.313 0.895 0.681 0.907 5
175 gemini-2.5-flash energy_futures eco 0.853 0.866 0.753 0.893 5
176 gemini-2.5-flash energy_futures eld 0.771 0.939 0.871 0.954 5
177 gemini-2.5-flash energy_futures far 0.827 0.870 0.821 0.901 5
178 gemini-2.5-flash energy_futures fis 0.486 0.930 0.884 0.923 5
179 gemini-2.5-flash energy_futures lan 0.138 0.934 0.759 0.944 5
180 gemini-2.5-flash energy_futures par 0.220 0.900 0.823 0.910 5
181 gemini-2.5-flash energy_futures sub 0.375 0.850 0.733 0.893 5
182 gemini-2.5-flash energy_futures vil -0.119 0.910 0.892 0.932 5
183 gemini-2.5-flash zh_winterthur all 0.677 0.989 0.814 0.988 60
184 gemini-2.5-flash zh_winterthur coa 0.671 0.868 0.800 0.868 5
185 gemini-2.5-flash zh_winterthur csk 0.922 0.189 0.917 0.830 5
186 gemini-2.5-flash zh_winterthur ctr 0.586 0.880 0.949 0.912 5
187 gemini-2.5-flash zh_winterthur dis 0.451 0.803 0.800 0.885 5
188 gemini-2.5-flash zh_winterthur eco 0.882 0.819 0.750 0.843 5
189 gemini-2.5-flash zh_winterthur eld 0.769 0.962 0.704 0.961 5
190 gemini-2.5-flash zh_winterthur far 0.700 0.871 0.909 0.906 5
191 gemini-2.5-flash zh_winterthur fis 0.671 0.885 0.853 0.925 5
192 gemini-2.5-flash zh_winterthur lan 0.605 0.886 0.825 0.931 5
193 gemini-2.5-flash zh_winterthur par 0.872 0.722 0.840 0.851 5
194 gemini-2.5-flash zh_winterthur sub 0.419 0.843 0.705 0.843 5
195 gemini-2.5-flash zh_winterthur vil 0.588 0.825 0.806 0.863 5
196 gpt-3.5-turbo ccps all -0.218 0.898 0.554 0.890 60
197 gpt-3.5-turbo ccps coa -0.303 0.788 0.813 0.863 5
198 gpt-3.5-turbo ccps csk 0.128 0.961 0.612 0.961 5
199 gpt-3.5-turbo ccps ctr 0.237 0.937 0.797 0.937 5
200 gpt-3.5-turbo ccps dis -0.689 0.883 0.753 0.921 5
201 gpt-3.5-turbo ccps eco -0.335 0.842 0.842 0.899 5
202 gpt-3.5-turbo ccps eld 0.211 0.893 0.853 0.883 5
203 gpt-3.5-turbo ccps far -0.632 0.552 0.734 0.681 5
204 gpt-3.5-turbo ccps fis -0.737 0.891 0.802 0.893 5
205 gpt-3.5-turbo ccps lan -0.406 0.825 0.718 0.842 5
206 gpt-3.5-turbo ccps par -0.021 0.922 0.833 0.938 5
207 gpt-3.5-turbo ccps sub -0.248 0.905 0.780 0.894 5
208 gpt-3.5-turbo ccps vil -0.243 0.920 0.811 0.928 5
209 gpt-3.5-turbo energy_futures all -0.137 0.893 0.640 0.895 60
210 gpt-3.5-turbo energy_futures coa -0.053 0.951 0.687 0.952 5
211 gpt-3.5-turbo energy_futures csk 0.277 0.981 0.907 0.982 5
212 gpt-3.5-turbo energy_futures ctr -0.231 0.949 0.785 0.954 5
213 gpt-3.5-turbo energy_futures dis -0.518 0.949 0.779 0.951 5
214 gpt-3.5-turbo energy_futures eco -0.420 0.859 0.902 0.888 5
215 gpt-3.5-turbo energy_futures eld -0.336 0.928 0.749 0.927 5
216 gpt-3.5-turbo energy_futures far -0.075 0.890 0.822 0.923 5
217 gpt-3.5-turbo energy_futures fis -0.611 0.946 0.707 0.953 5
218 gpt-3.5-turbo energy_futures lan -0.623 0.966 0.793 0.968 5
219 gpt-3.5-turbo energy_futures par 0.057 0.946 0.749 0.950 5
220 gpt-3.5-turbo energy_futures sub -0.260 0.876 0.728 0.900 5
221 gpt-3.5-turbo energy_futures vil -0.611 0.910 0.774 0.912 5
222 gpt-3.5-turbo zh_winterthur all -0.284 0.941 0.604 0.934 60
223 gpt-3.5-turbo zh_winterthur coa -0.589 0.919 0.709 0.917 5
224 gpt-3.5-turbo zh_winterthur csk 0.084 0.914 0.772 0.912 5
225 gpt-3.5-turbo zh_winterthur ctr -0.394 0.863 0.780 0.903 5
226 gpt-3.5-turbo zh_winterthur dis -0.949 0.922 0.755 0.907 5
227 gpt-3.5-turbo zh_winterthur eco -0.662 0.892 0.657 0.890 5
228 gpt-3.5-turbo zh_winterthur eld -0.108 0.913 0.802 0.898 5
229 gpt-3.5-turbo zh_winterthur far -0.457 0.881 0.750 0.898 5
230 gpt-3.5-turbo zh_winterthur fis -0.647 0.919 0.742 0.914 5
231 gpt-3.5-turbo zh_winterthur lan -0.370 0.863 0.836 0.904 5
232 gpt-3.5-turbo zh_winterthur par -0.527 0.796 0.753 0.880 5
233 gpt-3.5-turbo zh_winterthur sub 0.190 0.886 0.773 0.891 5
234 gpt-3.5-turbo zh_winterthur vil -0.616 0.952 0.773 0.919 5
235 gpt-4o-mini ccps all 0.123 0.977 0.699 0.974 60
236 gpt-4o-mini ccps coa 0.313 0.784 0.804 0.851 5
237 gpt-4o-mini ccps csk -0.366 0.911 0.590 0.906 5
238 gpt-4o-mini ccps ctr 0.582 0.791 0.774 0.886 5
239 gpt-4o-mini ccps dis 0.334 0.838 0.740 0.876 5
240 gpt-4o-mini ccps eco 0.490 0.845 0.821 0.905 5
241 gpt-4o-mini ccps eld -0.185 0.873 0.912 0.910 5
242 gpt-4o-mini ccps far -0.197 0.838 0.912 0.901 5
243 gpt-4o-mini ccps fis -0.331 0.878 0.869 0.920 5
244 gpt-4o-mini ccps lan 0.536 0.873 0.689 0.892 5
245 gpt-4o-mini ccps par 0.012 0.875 0.748 0.913 5
246 gpt-4o-mini ccps sub -0.347 0.892 0.919 0.936 5
247 gpt-4o-mini ccps vil -0.327 0.860 0.862 0.920 5
248 gpt-4o-mini energy_futures all -0.010 0.967 0.594 0.960 60
249 gpt-4o-mini energy_futures coa -0.326 0.906 0.808 0.899 5
250 gpt-4o-mini energy_futures csk -0.294 0.859 0.783 0.827 5
251 gpt-4o-mini energy_futures ctr 0.369 0.913 0.654 0.926 5
252 gpt-4o-mini energy_futures dis -0.243 0.920 0.757 0.902 5
253 gpt-4o-mini energy_futures eco -0.498 0.904 0.699 0.915 5
254 gpt-4o-mini energy_futures eld -0.307 0.922 0.786 0.921 5
255 gpt-4o-mini energy_futures far -0.294 0.916 0.799 0.914 5
256 gpt-4o-mini energy_futures fis -0.140 0.929 0.707 0.931 5
257 gpt-4o-mini energy_futures lan -0.082 0.948 0.802 0.957 5
258 gpt-4o-mini energy_futures par -0.219 0.918 0.778 0.914 5
259 gpt-4o-mini energy_futures sub 0.599 0.956 0.753 0.959 5
260 gpt-4o-mini energy_futures vil 0.353 0.934 0.789 0.930 5
261 gpt-4o-mini zh_winterthur all -0.112 0.983 0.629 0.977 60
262 gpt-4o-mini zh_winterthur coa -0.044 0.820 0.677 0.857 5
263 gpt-4o-mini zh_winterthur csk 0.201 0.683 0.773 0.668 5
264 gpt-4o-mini zh_winterthur ctr -0.195 0.852 0.752 0.857 5
265 gpt-4o-mini zh_winterthur dis 0.080 0.869 0.772 0.858 5
266 gpt-4o-mini zh_winterthur eco 0.260 0.865 0.758 0.876 5
267 gpt-4o-mini zh_winterthur eld -0.475 0.841 0.740 0.807 5
268 gpt-4o-mini zh_winterthur far -0.618 0.826 0.774 0.871 5
269 gpt-4o-mini zh_winterthur fis -0.262 0.867 0.898 0.927 5
270 gpt-4o-mini zh_winterthur lan 0.143 0.906 0.893 0.923 5
271 gpt-4o-mini zh_winterthur par -0.644 0.864 0.758 0.848 5
272 gpt-4o-mini zh_winterthur sub -0.293 0.871 0.738 0.877 5
273 gpt-4o-mini zh_winterthur vil -0.782 0.891 0.744 0.907 5
274 grok-3-beta ccps all 0.427 0.990 0.731 0.987 60
275 grok-3-beta ccps coa 0.245 0.862 0.856 0.910 5
276 grok-3-beta ccps csk 0.786 0.900 0.917 0.921 5
277 grok-3-beta ccps ctr 0.223 0.924 0.870 0.940 5
278 grok-3-beta ccps dis 0.237 0.909 0.882 0.923 5
279 grok-3-beta ccps eco 0.666 0.877 0.855 0.859 5
280 grok-3-beta ccps eld 0.304 0.885 0.828 0.883 5
281 grok-3-beta ccps far 0.224 0.872 0.882 0.919 5
282 grok-3-beta ccps fis 0.325 0.885 0.837 0.926 5
283 grok-3-beta ccps lan 0.384 0.923 0.815 0.916 5
284 grok-3-beta ccps par 0.232 0.899 0.882 0.923 5
285 grok-3-beta ccps sub 0.378 0.918 0.853 0.937 5
286 grok-3-beta ccps vil 0.323 0.921 0.837 0.912 5
287 grok-3-beta energy_futures all 0.691 0.985 0.867 0.986 60
288 grok-3-beta energy_futures coa 0.878 0.885 0.760 0.912 5
289 grok-3-beta energy_futures csk 0.797 0.856 0.908 0.900 5
290 grok-3-beta energy_futures ctr 0.278 0.908 0.673 0.922 5
291 grok-3-beta energy_futures dis 0.514 0.949 0.773 0.923 5
292 grok-3-beta energy_futures eco 0.954 0.919 0.529 0.919 5
293 grok-3-beta energy_futures eld 0.805 0.908 0.821 0.902 5
294 grok-3-beta energy_futures far 0.515 0.935 0.754 0.917 5
295 grok-3-beta energy_futures fis 0.834 0.921 0.708 0.928 5
296 grok-3-beta energy_futures lan 0.552 0.927 0.692 0.926 5
297 grok-3-beta energy_futures par 0.901 0.937 0.836 0.942 5
298 grok-3-beta energy_futures sub 0.857 0.910 0.782 0.930 5
299 grok-3-beta energy_futures vil 0.640 0.907 0.367 0.918 5
300 grok-3-beta zh_winterthur all 0.759 0.988 0.855 0.987 60
301 grok-3-beta zh_winterthur coa 0.580 0.926 0.806 0.896 5
302 grok-3-beta zh_winterthur csk 0.801 0.738 0.729 0.835 5
303 grok-3-beta zh_winterthur ctr 0.840 0.885 0.721 0.894 5
304 grok-3-beta zh_winterthur dis 0.614 0.831 0.733 0.883 5
305 grok-3-beta zh_winterthur eco 0.467 0.793 0.763 0.857 5
306 grok-3-beta zh_winterthur eld 0.771 0.914 0.835 0.917 5
307 grok-3-beta zh_winterthur far 0.752 0.907 0.835 0.899 5
308 grok-3-beta zh_winterthur fis 0.647 0.908 0.567 0.925 5
309 grok-3-beta zh_winterthur lan 0.825 0.901 0.868 0.907 5
310 grok-3-beta zh_winterthur par 0.877 0.929 0.781 0.941 5
311 grok-3-beta zh_winterthur sub 0.726 0.904 0.881 0.920 5
312 grok-3-beta zh_winterthur vil 0.878 0.833 0.781 0.910 5

Model/Survey DRI Plots

Survey/Role DRI Plots

Top models

Bottom models

Permutation tests

Surveys and Roles: Are models truly consistent across roles?

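In this first permutation test, we estimate how likely it is that the observed consistency, measured by DRI, arose by chance. For each survey/role pair, the observed DRI is compared against the distribution of DRI values obtained from permuted data.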
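As a rough illustration of the logic, the sketch below computes a one-sided permutation p-value as the share of permuted statistics that are at least as large as the observed one. The helper `perm_p_value()` and the permuted values are hypothetical, and this is not necessarily how `permute_dri()` computes its p-values internally.

```r
# Minimal sketch of a one-sided permutation p-value.
# `perm_p_value` is a hypothetical helper, not part of deliberr.
perm_p_value <- function(obs, perm) {
  # Share of permuted statistics at least as large as the observed one
  mean(perm >= obs)
}

# Made-up permuted DRI values for a single survey/role pair
perm_dri <- c(0.017, 0.019, 0.020, 0.021, 0.022,
              0.022, 0.024, 0.026, 0.028, 0.030)
obs_dri <- 0.045

# No permuted value reaches the observed DRI, so the p-value is 0
perm_p_value(obs_dri, perm_dri)

# With 10 permutations, p-values can only change in steps of 1/10
1 / length(perm_dri)
```

With so few permutations, a reported p of 0 only means that no permuted value reached the observed one. It does not mean the true probability is zero.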

## Warning: Using `bins = 30` by default. Pick better value with the argument
## `bins`.

Number of surveys (out of 3) in which each role is significant (p < 0.05)
role sig
csk 2
dis 2
far 2
lan 2
vil 2
coa 3
ctr 3
eco 3
eld 3
fis 3
par 3
sub 3

Number of roles (out of 12) that are significant (p < 0.05) in each survey
survey sig
ccps 10
zh_winterthur 10
energy_futures 11

Survey/Role Permutation Summary
obs_dri p n min max median iqr mean sd se ci survey role
0.045 0.0 10 0.017 0.030 0.022 0.005 0.023 0.005 0.001 0.003 ccps eco
0.158 0.0 10 0.124 0.146 0.136 0.008 0.136 0.007 0.002 0.005 ccps coa
0.293 0.0 10 0.224 0.271 0.236 0.013 0.240 0.014 0.004 0.010 ccps ctr
0.129 0.0 10 0.108 0.126 0.119 0.004 0.118 0.005 0.002 0.003 ccps dis
0.118 0.0 10 0.086 0.114 0.100 0.013 0.100 0.009 0.003 0.007 ccps eld
-0.045 0.0 10 -0.058 -0.048 -0.052 0.002 -0.053 0.003 0.001 0.002 ccps fis
0.231 0.0 10 0.199 0.221 0.206 0.014 0.208 0.009 0.003 0.006 ccps lan
0.114 0.0 10 0.095 0.112 0.104 0.008 0.104 0.005 0.002 0.004 ccps par
0.035 0.0 10 -0.003 0.029 0.010 0.018 0.010 0.011 0.003 0.008 ccps sub
0.144 0.0 10 -0.041 0.130 0.028 0.037 0.030 0.049 0.015 0.035 ccps csk
-0.041 0.0 10 -0.125 -0.089 -0.112 0.019 -0.108 0.012 0.004 0.009 energy_futures eco
-0.006 0.0 10 -0.088 -0.038 -0.056 0.021 -0.060 0.017 0.005 0.012 energy_futures coa
0.057 0.0 10 -0.073 -0.005 -0.026 0.021 -0.028 0.020 0.006 0.014 energy_futures ctr
-0.031 0.0 10 -0.077 -0.046 -0.068 0.005 -0.065 0.009 0.003 0.007 energy_futures dis
0.092 0.0 10 0.016 0.082 0.058 0.033 0.055 0.022 0.007 0.016 energy_futures eld
0.000 0.0 10 -0.087 -0.030 -0.056 0.017 -0.058 0.016 0.005 0.012 energy_futures far
0.085 0.0 10 0.031 0.050 0.042 0.006 0.040 0.006 0.002 0.004 energy_futures fis
0.160 0.0 10 0.093 0.125 0.103 0.005 0.105 0.011 0.003 0.008 energy_futures par
0.215 0.0 10 0.092 0.156 0.144 0.016 0.134 0.022 0.007 0.016 energy_futures sub
0.029 0.0 10 -0.046 -0.009 -0.025 0.006 -0.026 0.009 0.003 0.007 energy_futures vil
0.287 0.0 10 -0.019 0.097 0.018 0.060 0.023 0.041 0.013 0.029 energy_futures csk
-0.047 0.0 10 -0.064 -0.052 -0.058 0.004 -0.058 0.003 0.001 0.002 zh_winterthur eco
-0.093 0.0 10 -0.122 -0.110 -0.118 0.007 -0.117 0.004 0.001 0.003 zh_winterthur coa
-0.009 0.0 10 -0.083 -0.032 -0.062 0.016 -0.062 0.014 0.005 0.010 zh_winterthur ctr
0.089 0.0 10 0.035 0.086 0.063 0.019 0.064 0.015 0.005 0.011 zh_winterthur eld
-0.131 0.0 10 -0.172 -0.148 -0.161 0.012 -0.159 0.008 0.003 0.006 zh_winterthur far
-0.185 0.0 10 -0.197 -0.189 -0.192 0.003 -0.192 0.003 0.001 0.002 zh_winterthur fis
0.133 0.0 10 0.098 0.131 0.123 0.012 0.120 0.010 0.003 0.007 zh_winterthur lan
-0.221 0.0 10 -0.246 -0.233 -0.240 0.006 -0.240 0.004 0.001 0.003 zh_winterthur par
-0.128 0.0 10 -0.195 -0.144 -0.170 0.020 -0.171 0.016 0.005 0.012 zh_winterthur sub
-0.278 0.0 10 -0.301 -0.281 -0.295 0.004 -0.293 0.006 0.002 0.004 zh_winterthur vil
0.121 0.1 10 0.097 0.123 0.104 0.010 0.107 0.009 0.003 0.006 ccps far
0.048 0.1 10 0.022 0.051 0.038 0.004 0.038 0.007 0.002 0.005 ccps vil
0.194 0.1 10 -0.029 0.205 0.022 0.065 0.047 0.072 0.023 0.052 zh_winterthur csk
-0.003 0.2 10 -0.042 0.027 -0.018 0.026 -0.015 0.024 0.008 0.017 energy_futures lan
-0.231 0.2 10 -0.241 -0.228 -0.234 0.004 -0.235 0.005 0.001 0.003 zh_winterthur dis
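The per-role and per-survey counts of significant results can be derived from a summary table like this one. The sketch below assumes the summary is available as a data frame with `survey`, `role` and `p` columns. The name `perm_summary` is an assumption, and only five rows are copied from the table, so the counts it prints cover that subset rather than the full results.

```r
# A few rows copied from the Survey/Role Permutation Summary;
# `perm_summary` is a hypothetical name for that table.
perm_summary <- data.frame(
  survey = c("ccps", "ccps", "ccps", "zh_winterthur", "zh_winterthur"),
  role   = c("eco", "far", "vil", "csk", "eld"),
  p      = c(0.0, 0.1, 0.1, 0.1, 0.0)
)

# Flag significant rows, then count them per role and per survey
perm_summary$sig <- perm_summary$p < 0.05
sig_by_role   <- aggregate(sig ~ role, data = perm_summary, FUN = sum)
sig_by_survey <- aggregate(sig ~ survey, data = perm_summary, FUN = sum)

# Sort roles by their number of significant surveys
sig_by_role[order(sig_by_role$sig), ]
sig_by_survey
```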

Models and Surveys: Which models are consistent across roles?

## Warning: Using `bins = 30` by default. Pick better value with the argument
## `bins`.

Survey/Model Permutation Summary
obs_dri p n min max median iqr mean sd se ci survey model
0.417 0.0 10 -0.270 -0.121 -0.200 0.111 -0.202 0.058 0.018 0.042 ccps claude-3-5-sonnet-20241022
0.676 0.0 10 -0.152 0.026 -0.117 0.071 -0.092 0.058 0.018 0.041 ccps claude-3-7-sonnet-20250219
0.427 0.0 10 -0.291 -0.147 -0.250 0.074 -0.229 0.053 0.017 0.038 ccps grok-3-beta
0.711 0.0 10 -0.086 0.073 -0.061 0.021 -0.042 0.056 0.018 0.040 ccps gemini-2.5-flash
0.123 0.0 10 -0.343 -0.212 -0.318 0.082 -0.292 0.052 0.016 0.037 ccps gpt-4o-mini
-0.238 0.0 10 -0.520 -0.364 -0.476 0.065 -0.465 0.050 0.016 0.036 ccps command-r-08-2024
-0.171 0.0 10 -0.475 -0.328 -0.418 0.045 -0.414 0.043 0.014 0.031 ccps claude-3-haiku-20240307
0.497 0.0 10 -0.284 -0.138 -0.234 0.023 -0.222 0.045 0.014 0.032 energy_futures claude-3-5-sonnet-20241022
0.591 0.0 10 -0.200 -0.037 -0.167 0.039 -0.148 0.055 0.017 0.039 energy_futures claude-3-7-sonnet-20250219
0.691 0.0 10 -0.131 0.030 -0.110 0.041 -0.086 0.057 0.018 0.041 energy_futures grok-3-beta
0.527 0.0 10 -0.213 0.067 -0.111 0.121 -0.098 0.086 0.027 0.062 energy_futures gemini-2.5-flash
-0.010 0.0 10 -0.243 -0.139 -0.172 0.050 -0.183 0.037 0.012 0.026 energy_futures gpt-4o-mini
-0.137 0.0 10 -0.215 -0.179 -0.194 0.014 -0.193 0.011 0.004 0.008 energy_futures gpt-3.5-turbo
0.075 0.0 10 -0.065 0.006 -0.030 0.038 -0.030 0.025 0.008 0.018 energy_futures command-r-08-2024
-0.182 0.0 10 -0.406 -0.272 -0.360 0.083 -0.350 0.052 0.016 0.037 energy_futures claude-3-haiku-20240307
0.624 0.0 10 -0.146 -0.007 -0.124 0.030 -0.103 0.050 0.016 0.036 zh_winterthur claude-3-5-sonnet-20241022
0.649 0.0 10 -0.115 0.169 -0.079 0.096 -0.042 0.093 0.029 0.066 zh_winterthur claude-3-7-sonnet-20250219
0.759 0.0 10 -0.018 0.265 0.105 0.131 0.087 0.093 0.029 0.066 zh_winterthur grok-3-beta
0.677 0.0 10 -0.154 0.120 -0.044 0.126 -0.057 0.087 0.028 0.062 zh_winterthur gemini-2.5-flash
-0.112 0.0 10 -0.543 -0.349 -0.480 0.086 -0.462 0.064 0.020 0.046 zh_winterthur gpt-4o-mini
-0.284 0.0 10 -0.404 -0.361 -0.378 0.015 -0.379 0.013 0.004 0.010 zh_winterthur gpt-3.5-turbo
-0.520 0.0 10 -0.680 -0.586 -0.629 0.049 -0.635 0.030 0.010 0.022 zh_winterthur claude-3-haiku-20240307
-0.680 0.2 10 -0.722 -0.633 -0.697 0.026 -0.694 0.027 0.009 0.020 zh_winterthur command-r-08-2024
-0.218 0.4 10 -0.258 -0.127 -0.224 0.035 -0.219 0.038 0.012 0.027 ccps gpt-3.5-turbo

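Most models appear to be consistent across roles. In 22 of the 24 model/survey pairs, none of the permuted DRI values exceeded the observed DRI (p = 0.0), suggesting that the observed value is unlikely to be due to chance. The two exceptions are command-r-08-2024 on zh_winterthur (p = 0.2) and gpt-3.5-turbo on ccps (p = 0.4). The summary reports n = 10 for each test, so if n is the number of permutations, these p-values are resolved only in steps of 0.1 and should be read as indicative.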
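A statement like this can be checked against the summary table by filtering on the p-value. The sketch below assumes the table is held as a data frame. The name `model_summary` is hypothetical, and only four rows are copied from the Survey/Model Permutation Summary.

```r
# Rows copied from the Survey/Model Permutation Summary;
# `model_summary` is a hypothetical name for that table.
model_summary <- data.frame(
  survey  = c("ccps", "ccps", "zh_winterthur", "zh_winterthur"),
  model   = c("grok-3-beta", "gpt-3.5-turbo",
              "grok-3-beta", "command-r-08-2024"),
  obs_dri = c(0.427, -0.218, 0.759, -0.680),
  p       = c(0.0, 0.4, 0.0, 0.2)
)

# Keep the model/survey pairs that do not reach p < 0.05
not_sig <- subset(model_summary, p >= 0.05)
not_sig[, c("survey", "model", "obs_dri", "p")]
```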

References